Automatic Co-scheduling Based on Main Memory Bandwidth Usage
Abstract
Most applications running on supercomputers achieve only a fraction of a system’s peak performance. It has been demonstrated that co-scheduling applications improves overall system utilization. However, the co-scheduled applications need to fulfill certain criteria so that mutual slowdown is kept to a minimum. In this paper we present a set of libraries and a first HPC scheduler prototype that automatically detects an application’s main memory bandwidth utilization and prevents the co-scheduling of multiple applications that are limited by main memory bandwidth. We demonstrate that our prototype achieves almost the same performance as the manually tuned co-schedules from our previous work.
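The rule stated in the abstract, never co-scheduling two applications that are both limited by main memory bandwidth, can be illustrated with a minimal sketch. This is not the authors' prototype or library API. The `Job` type, the `0.8` threshold, the function names, and the greedy pairing strategy are all assumptions made for illustration, and the bandwidth fraction is assumed to have been measured already by some external means.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical threshold: fraction of peak main memory bandwidth above which
# a job is treated as bandwidth limited. This value is not from the paper.
BANDWIDTH_LIMITED_THRESHOLD = 0.8


@dataclass(frozen=True)
class Job:
    name: str
    bandwidth_fraction: float  # measured bandwidth / peak bandwidth, in [0, 1]


def is_bandwidth_limited(job: Job) -> bool:
    """Classify a job by comparing its measured usage to the threshold."""
    return job.bandwidth_fraction >= BANDWIDTH_LIMITED_THRESHOLD


def may_coschedule(a: Job, b: Job) -> bool:
    """Allow a pair unless both jobs are bandwidth limited."""
    return not (is_bandwidth_limited(a) and is_bandwidth_limited(b))


def pair_jobs(queue: List[Job]) -> List[Tuple[Job, Optional[Job]]]:
    """Greedily pair each job with the first compatible job later in the queue.

    A job with no compatible partner runs alone (partner is None).
    """
    remaining = list(queue)
    schedule: List[Tuple[Job, Optional[Job]]] = []
    while remaining:
        first = remaining.pop(0)
        partner = next((j for j in remaining if may_coschedule(first, j)), None)
        if partner is not None:
            remaining.remove(partner)
        schedule.append((first, partner))
    return schedule
```

With a queue of two bandwidth-heavy and two compute-heavy jobs, this sketch pairs each bandwidth-heavy job with a compute-heavy one. Two bandwidth-heavy jobs with no other candidates each run alone.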
Similar resources
A Nested Loop-Level Parallelism for DSP in Reconfigurable Computing using Forward Scheduling
Reconfigurable computing has emerged as a co-processing approach in digital signal processing due to its loop-level parallelism. This paper presents some techniques for mapping nested loops onto a coarse-grained reconfigurable architecture. Based on the generic target architecture and the limited memory bandwidth, the interconnections of processing elements are modeled as a merged expression tre...
Memory Access The Third Dimension of Scheduling
Up to now, two kinds of internal events have influenced the scheduling decisions of contemporary operating systems: timer events and I/O-related interrupts. Timing information supports preemption and priority adjustment. Knowledge about issued or completed I/O operations helps to wake up sleeping processes and to boost their priority. Preferring deblocked processes at the end of I/O operations improves the intera...
Efficient Co-Scheduling of Parallel Jobs in Cluster Computing
Co-scheduling of parallel jobs on chips is well known to produce benefits in both system and individual job efficiency. Existing work has shown that job co-scheduling can effectively alleviate contention, yet the question of how to determine optimal co-schedules remains unanswered. The need for co-scheduling has been typically associated with communication bandwidth and th...
Co-scheduling on Upcoming Many-Core Architectures
Co-scheduling is known to optimize the utilization of supercomputers. By choosing applications with distinct resource demands, the application throughput can be increased, avoiding an underutilization of the available nodes. This is especially true for traditional multi-core architectures, where a subset of the available cores is already able to saturate the main memory bandwidth. In this paper, w...
Memory Partitioning Algorithm for Modulo Scheduling on Coarse-Grained Reconfigurable Architectures
Coarse-Grained Reconfigurable Architectures (CGRAs) have become increasingly popular because of their flexibility, performance and power efficiency. CGRAs are often used to accelerate computation-intensive applications with their high degree of parallelism; however, the performance of CGRAs is limited by the bottleneck of memory bandwidth. Scratchpad memory with multiple banks architecture c...
Publication date: 2016